MMKnots: A max-margin model for RNA secondary structure prediction including pseudoknots
Authors
Abstract
Motivation: The ideal algorithm for the prediction of pseudoknotted RNA secondary structures will provide fast and accurate predictions for pseudoknots of arbitrary complexity. However, existing algorithms are typically lacking on one of these three axes. Energy-based methods suffer from the intractability of pseudoknotted structure prediction under realistic energy models, while statistical approaches struggle with inference on the vast space of possible structures.
Results: In this paper, we present MMKnots, an algorithm for secondary structure prediction including pseudoknots. MMKnots leverages a max-margin framework to train a model with a simple scoring scheme, which then enables the use of efficient (approximate) algorithms at prediction time. Experiments on datasets not observed in training show that MMKnots outperforms the state-of-the-art pseudoknot prediction algorithm without sacrificing speed. Furthermore, whereas existing algorithms tend to be conservative in predicting pseudoknots, MMKnots can include these complex interactions more easily due to the flexibility of the model representation. MMKnots illustrates the potential of this framework in excelling on the three major criteria of accuracy, generality and speed.
Availability: MMKnots will be made available at http://ai.stanford.edu/~csfoo/mmknots upon publication.
Contact: [email protected]
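For readers unfamiliar with the term, a pseudoknot arises when two base pairs (i, j) and (k, l) cross, that is, i < k < j < l. The following is a minimal illustrative sketch of that definition only, not part of MMKnots. It assumes structures are written in the common extended dot-bracket notation, where `()`, `[]` and `{}` mark paired positions; the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]

# Assumption: extended dot-bracket notation, not taken from the abstract.
OPENERS: Dict[str, str] = {"(": ")", "[": "]", "{": "}"}
CLOSERS: Dict[str, str] = {close: open_ for open_, close in OPENERS.items()}


def parse_brackets(structure: str) -> List[Pair]:
    """Convert a dot-bracket string into a sorted list of 0-based base pairs."""
    stacks: Dict[str, List[int]] = {open_: [] for open_ in OPENERS}
    pairs: List[Pair] = []
    for pos, ch in enumerate(structure):
        if ch in OPENERS:
            stacks[ch].append(pos)
        elif ch in CLOSERS:
            stack = stacks[CLOSERS[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def crossing_pairs(pairs: List[Pair]) -> List[Tuple[Pair, Pair]]:
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted(pairs)
    crossings: List[Tuple[Pair, Pair]] = []
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(structure: str) -> bool:
    """True if the structure contains at least one crossing pair."""
    return bool(crossing_pairs(parse_brackets(structure)))
```

A nested hairpin such as `((..))` has no crossings, while `((..[[..))..]]` does, because the square-bracket helix opens inside the round-bracket helix and closes outside it. The quadratic scan is chosen for clarity rather than speed.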
Related articles
RNA Secondary Structure Prediction Algorithms
RNA secondary structure prediction is an important problem that has been studied extensively over the past three decades. However, pseudoknots are usually excluded from RNA secondary structure prediction because it is hard to examine all possible structures efficiently and to model the energy correctly. Current algorithms for predicting structures with pseudoknots usually have extremely high resource requirements...
P-DCFold or How to Predict all Kinds of Pseudoknots in RNA Secondary Structures
Pseudoknots play important roles in many RNAs. But for computational reasons, pseudoknots are usually excluded from the definition of RNA secondary structures. Indeed, predicting pseudoknots greatly increases the time complexity of the algorithms, given that all existing algorithms for RNA secondary structure prediction have a complexity of at least O(n^3). Some algorithms have been ...
P-DCFold: an algorithm for RNA secondary structure prediction including all kinds of pseudoknots
Pseudoknots in RNA secondary structures play important roles, but unfortunately, their prediction is a very difficult task. The prediction of an RNA secondary structure remains unresolved, even when the structure does not contain pseudoknots. Many algorithms have been proposed, but most of them are not satisfactory in their results or their complexity. Particularly, when pseudoknots are taken into account, ...
DotKnot: pseudoknot prediction using the probability dot plot under a refined energy model
RNA pseudoknots are functional structure elements with key roles in viral and cellular processes. Prediction of a pseudoknotted minimum free energy structure is an NP-complete problem. Practical algorithms for RNA structure prediction including restricted classes of pseudoknots suffer from high runtime and poor accuracy for longer sequences. A heuristic approach is to search for promising pseud...
DP Algorithms for RNA Secondary Structure Prediction with Pseudoknots
This paper describes simple DP (dynamic programming) algorithms for RNA secondary structure prediction with pseudoknots, for which no explicit DP algorithm had been known. Results of preliminary computational experiments are described too.